Abstract

This paper deals with the problem of finding, for a given graph and a given natural 
number k, a subgraph of k nodes with a maximum number of edges. This problem is known 
as the k-cluster problem and it is NP-hard on general graphs as well as on chordal graphs. In 
this paper, it is shown that the k-cluster problem is solvable in polynomial time on interval 
graphs. In particular, we present two polynomial time algorithms for the class of proper 
interval graphs and the class of general interval graphs, respectively. Both algorithms are 
based on a matrix representation for interval graphs. In contrast to representations used in 
most of the previous work, this matrix representation does not make use of the maximal 
cliques in the investigated graph. 
1 Introduction

A graph G is called an interval graph if its nodes can be assigned to intervals on the real line
so that two nodes are adjacent in G if and only if their assigned intervals intersect. The set of
intervals assigned to the nodes of G is called a realization of G. A proper interval graph is an
interval graph that has an intersection model in which no interval strictly contains another.
Interval and proper interval graphs have been studied extensively in the literature, and several
linear-time algorithms are known for their recognition. They are important for their
applications to scheduling problems, biology, VLSI circuit design, as well as to psychology and
social sciences in general.

The class of interval graphs is of major importance for the study of the complexity of several
difficult optimization problems, which are solvable in polynomial time on interval graphs but NP-hard in
the general case. Among these problems are the maximum clique, the maximum independent
set [7], the Hamiltonian cycle and the Hamiltonian path [8].

This paper deals with the problem of finding, for a given graph and a given natural number
k, a subgraph on k nodes with a maximum number of edges. This problem is called the k-cluster
problem. It is known to be NP-hard, as a generalization of the
maximum clique problem. It remains NP-hard even when restricted to comparability graphs,
bipartite graphs and chordal graphs [9]. On the other hand, it has been proved that
there are polynomial algorithms for the k-cluster problem on cographs, as well as on k-trees
and split graphs [9]. Furthermore, it has been proved that the decision version of the k-cluster
problem is solvable in polynomial time when searching for fixed-density k-subgraphs, while it
remains NP-hard when searching for a k-subgraph with at least $f(k)$ edges, where $f(k)$ is of
order $k^{\varepsilon}$ for some $\varepsilon > 0$ [10]. Finally, there are also some other polynomial time algorithms designed for
the k-cluster problem on some special classes of proper interval graphs, e.g., the graphs
whose clique graph is a simple path.

In the present work, it is proved that the k-cluster problem on proper interval graphs, as
well as on the general class of interval graphs, is solvable in polynomial time, and thus the
corresponding open problem stated in [9] is answered. To this end, a matrix representation
that characterizes these classes of graphs is used. This representation does not use their
maximal cliques, as the vast majority of the existing characterizations do.

2 The interval graphs in the general case 

Without loss of generality, we may suppose that all intervals in a realization of an interval graph
are closed, i.e. of the form $[a, b]$. However, this representation is too general for our purposes. A more
suitable interval representation form is therefore presented in Definition 1 [12]. Recall that an interval
graph can be recognized in linear time. In the following, suppose we are given a realization
of an interval graph G on n nodes.

Definition 1. A representation of n intervals, having the following properties, is called a Normal
Interval Representation (NIR) form:

1. all intervals are of the form $[i, j)$, where $0 \le i < j \le n$,

2. exactly one interval begins at $i$, for every $i \in \{0, 1, \ldots, n-1\}$.

Suppose we are given a realization of the interval graph G. It can be converted to another
realization of the same graph, in which all 2n endpoints are distinct on the real line. This
can be done simply by perturbing them slightly, so that the structure of the graph remains
unchanged, under the condition that the relative order of the left endpoints of any two intervals
is not reversed. After that, an arbitrary closed interval $[a, b]$ may be replaced by $[a, b)$,
since the intersection of any two intervals, if nonempty, is a non-trivial interval. Next,
the right endpoint of every interval may be moved to the next greater left endpoint of an interval in the
current realization (or to one common last point, if no greater left endpoint exists),
resulting thus in exactly $n + 1$ distinct endpoints altogether. Finally, all
these endpoints may be mapped bijectively to the points $0, 1, \ldots, n$, obtaining thus an NIR form
of G in linear time $O(n)$.
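
The conversion just described can be sketched in Python. This is only an illustration under stated assumptions, not the linear-time procedure the text refers to: it sorts the left endpoints, so it runs in $O(n \log n)$, ties between endpoints are resolved by the sort order instead of an explicit perturbation, and the names `to_nir`, `intersects_closed` and `intersects_half_open` are hypothetical helpers introduced here.

```python
from bisect import bisect_right


def to_nir(intervals):
    """Map closed intervals [a, b] to an NIR form with the same intersection graph.

    Returns (order, nir): order[r] is the index of the input interval that
    receives left endpoint r, and nir[r] = (r, right) stands for [r, right).
    """
    order = sorted(range(len(intervals)), key=lambda v: intervals[v][0])
    lefts = [intervals[v][0] for v in order]
    nir = []
    for rank, v in enumerate(order):
        # New right endpoint: the number of left endpoints that are <= the old
        # right endpoint, i.e. the first rank whose interval starts too late.
        right = bisect_right(lefts, intervals[v][1])
        nir.append((rank, right))
    return order, nir


def intersects_closed(p, q):
    return p[0] <= q[1] and q[0] <= p[1]


def intersects_half_open(p, q):
    return p[0] < q[1] and q[0] < p[1]


example = [(3, 4), (0, 2), (6, 7), (1, 5)]
order, nir = to_nir(example)
print(nir)  # [(0, 2), (1, 3), (2, 3), (3, 4)]
```

On this example every pair of intervals intersects in the NIR form exactly when it did in the input, which is the property the conversion has to preserve.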

Lemma 1. An arbitrary graph is an interval graph iff it can be represented by the NIR form. 

Proof. An NIR form is clearly a set of intervals and thus it corresponds to an interval graph. 
Conversely, since any interval graph can be represented by an NIR form, this representation 
holds as a characterization of interval graphs. □ 

Since no two intervals in the NIR form share a common left endpoint, it is possible to define
a total order over them by their left endpoints. Let the $i$-th interval be $[i-1, b)$. Now recall the Heaviside function:

$$H(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

Definition 2. Consider the $i$-th interval $[i-1, b)$ of the NIR form of the interval graph G, for
which we define the quantity $x_i := b - i$. Then, the square matrix

$$H_G(i,j) = \begin{cases} H(x_j + j - i), & \text{if } i > j \\ 0, & \text{otherwise} \end{cases}$$

is called the Normal Interval Representation (NIR) matrix of G.
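
As a small illustration of Definition 2, the following Python sketch builds the NIR matrix from the quantities $x_1, \ldots, x_n$. It assumes the convention $H(0) = 1$, which is what the intersection condition between the $j$-th and the $i$-th interval requires; the function names `heaviside` and `nir_matrix` are hypothetical and not taken from the paper.

```python
def heaviside(t):
    # Convention assumed here: H(t) = 1 for t >= 0, and 0 otherwise.
    return 1 if t >= 0 else 0


def nir_matrix(x):
    """Return H_G as a list of rows; H[i - 1][j - 1] holds H_G(i, j).

    x[i - 1] plays the role of x_i = b - i for the i-th interval [i - 1, b).
    """
    n = len(x)
    H = [[0] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, i):  # only entries strictly below the diagonal
            H[i - 1][j - 1] = heaviside(x[j - 1] + j - i)
    return H


# NIR form [0,2), [1,3), [2,3), [3,4): x = (2-1, 3-2, 3-3, 4-4).
x = [1, 1, 0, 0]
H = nir_matrix(x)
for row in H:
    print(row)
```

Each column $j$ of the result carries exactly $x_j$ ones directly below the diagonal, so the column sums reproduce the vector `x`.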

In the above definition the quantity $x_i$ equals the number of intervals among the
$(i+1)$-th, $\ldots$, $n$-th ones that intersect the $i$-th one. $H_G$ is a lower triangular matrix with
zero diagonal, having a chain of $x_i$ consecutive 1's under the $i$-th diagonal element and all the
remaining matrix entries being zero. It can also be seen as the lower triangular portion of the
adjacency matrix of G, where however rows and columns are ordered in a particular way. Specifically,
the $i$-th interval of G is represented schematically by the $i$-th column of $H_G$. Figure 1(a)
shows an example of the form of $H_G$.

Denote further the desired k-subgraph of G with the maximum number of edges as $G_k$. Associate
the variable $z_i \in \{0,1\}$ with the $i$-th interval. The case $z_i = 1$ indicates that the $i$-th node of G, i.e.
the $i$-th interval of its NIR form, is included in $G_k$. Let now $1 \le j < i \le n$. The $j$-th and the $i$-th
intervals intersect in $G_k$ if and only if the quantity $z_j \cdot z_i \cdot H(x_j + j - i) \in \{0,1\}$ equals one.
Indeed, in this case both intervals have been chosen in $G_k$, i.e. $z_i = z_j = 1$ and, simultaneously,
the $j$-th interval ends strictly further than $i - 1$, where the $i$-th one begins, i.e. $H(x_j + j - i) = 1$.
Thus, the number of intersections among the k intervals of the realization of $G_k$ equals

$$\sum_{i=2}^{n} \sum_{j=1}^{i-1} z_j \, z_i \, H(x_j + j - i) = z^T H_G \, z \tag{1}$$

where $z = [\, z_1 \; z_2 \; \cdots \; z_n \,]^T$ and $H_G$ is the NIR matrix of G.

Since $G_k$ has exactly k nodes, exactly k entries of the vector z are one. Thus, the k-cluster
problem on G is equivalent to finding the appropriate subset $I \subseteq \{1, 2, \ldots, n\}$ of the nonzero
entries of z, with $|I| = k$, so that the following quantity is maximized:

$$z^T H_G \, z = \sum_{\substack{i,j \in I \\ i > j}} H_G(i,j) = \sum_{\substack{i,j \in I \\ i > j}} H(x_j + j - i) \tag{2}$$
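
To make the quadratic form concrete, the sketch below evaluates $z^T H_G z$ for a 0/1 vector $z$ and solves the k-cluster problem by exhaustive search over all subsets of size $k$. This brute-force search is exponential and is not the algorithm of the paper; it is meant only as a reference for checking small instances. The names `nir_matrix`, `edges_selected` and `brute_force_k_cluster` are hypothetical, and $H(0) = 1$ is assumed.

```python
from itertools import combinations


def nir_matrix(x):
    # H[i - 1][j - 1] = 1 iff i > j and x_j + j - i >= 0 (assumes H(0) = 1).
    n = len(x)
    return [[1 if i > j and x[j - 1] + j - i >= 0 else 0
             for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def edges_selected(H, z):
    """Value of z^T H z: the number of edges among the selected nodes."""
    n = len(z)
    return sum(z[i] * H[i][j] * z[j] for i in range(n) for j in range(n))


def brute_force_k_cluster(x, k):
    """Best value and one best subset (1-indexed); requires 0 <= k <= len(x)."""
    n = len(x)
    H = nir_matrix(x)
    best_value, best_subset = -1, None
    for subset in combinations(range(n), k):
        z = [1 if v in subset else 0 for v in range(n)]
        value = edges_selected(H, z)
        if value > best_value:
            best_value, best_subset = value, subset
    return best_value, [v + 1 for v in best_subset]


print(brute_force_k_cluster([1, 1, 0, 0], 3))  # (2, [1, 2, 3])
```

For the path-like instance with $x = (1, 1, 0, 0)$ and $k = 3$, the best choice is the first three intervals, which span two edges.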

Lemma 2. Any maximal clique of G corresponds bijectively to a row of its NIR matrix $H_G$, in
which at least one of its unit elements or its zero diagonal element does not have any chain of
1's below it.

Proof. Consider an arbitrary row of $H_G$, let it be the $i$-th one, in which exactly the $i_1$-th, $i_2$-th, $\ldots$, $i_r$-th
elements equal one. Clearly, the $i$-th and the $j$-th intervals intersect for every $j \in \{i_1, i_2, \ldots, i_r\}$,
since $H_G(i,j) = 1$. The $i_1$-th, $i_2$-th, $\ldots$, $i_r$-th intervals of G intersect each other also, due to the NIR
form of $H_G$. Thus, the $i_1$-th, $i_2$-th, $\ldots$, $i_r$-th, $i$-th intervals build a clique Q in G. Consider now the case
that in this row at least one of its $i_1$-th, $i_2$-th, $\ldots$, $i_r$-th, $i$-th elements, say the $j$-th one, does not have
any chain of 1's below it. Suppose also that there exists another clique Q' in G, which strictly
includes Q. Since $H_G(\ell_1, j) = H_G(i, \ell_2) = 0$ for every $\ell_1 > i$ and $\ell_2 \in \{1, 2, \ldots, i\} \setminus \{i_1, i_2, \ldots, i_r\}$,
the $\ell_1$-th and the $j$-th, as well as the $i$-th and the $\ell_2$-th intervals, do not intersect. Therefore, Q' cannot
be a clique, which is a contradiction. Thus, Q is a maximal clique.

Conversely, let Q be a maximal clique in G, which contains the $i_1$-th, $i_2$-th, $\ldots$, $i_{|Q|}$-th intervals of
its NIR form, where $i_1 < i_2 < \ldots < i_{|Q|}$. Consider now the $i_{|Q|}$-th row of $H_G$. Since Q is a
clique, the $i_1$-th, $i_2$-th, $\ldots$, $i_{|Q|-1}$-th intervals intersect the $i_{|Q|}$-th one and therefore $H_G(i_{|Q|}, j) = 1$
for every $j \in \{i_1, i_2, \ldots, i_{|Q|-1}\}$. Suppose $i_{|Q|} < n$. Then, if $H_G(i_{|Q|} + 1, j) = 1$ for every
$j \in \{i_1, i_2, \ldots, i_{|Q|}\}$, the $(i_{|Q|} + 1)$-th row corresponds to another clique Q' that includes Q strictly,
which is a contradiction. Thus, at least one of the $i_1$-th, $i_2$-th, $\ldots$, $i_{|Q|}$-th elements of the $i_{|Q|}$-th row does
not have any chain of 1's below it. Finally, in the case where $i_{|Q|} = n$, obviously none of the
elements of the $i_{|Q|}$-th row has any chain of 1's below it. □
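
Lemma 2 suggests a direct way to list the maximal cliques from the NIR matrix: scan the rows and keep those in which some unit element, or the diagonal element, has a zero directly below it (or which are the last row). The following Python sketch implements this reading of the lemma; the function names are hypothetical, and $H(0) = 1$ is assumed when building the matrix.

```python
def nir_matrix(x):
    # H[i - 1][j - 1] = 1 iff i > j and x_j + j - i >= 0 (assumes H(0) = 1).
    n = len(x)
    return [[1 if i > j and x[j - 1] + j - i >= 0 else 0
             for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def maximal_cliques(H):
    """Return the maximal cliques as lists of 1-indexed nodes, top row first."""
    n = len(H)
    cliques = []
    for i in range(n):
        # Nodes of the candidate clique: unit elements of row i plus the diagonal.
        members = [j for j in range(i) if H[i][j] == 1] + [i]
        # A column "has no chain of 1's below" row i if this is the last row
        # or the entry directly below is zero.
        ends_here = any(i == n - 1 or H[i + 1][j] == 0 for j in members)
        if ends_here:
            cliques.append([j + 1 for j in members])
    return cliques


# Path-like instance [0,2), [1,3), [2,3), [3,4).
print(maximal_cliques(nir_matrix([1, 1, 0, 0])))  # [[1, 2], [2, 3], [4]]
# Non-proper instance [0,3), [1,2), [2,3): interval 1 contains interval 2.
print(maximal_cliques(nir_matrix([2, 0, 0])))     # [[1, 2], [1, 3]]
```

Because a chain of 1's in a column is consecutive, checking only the entry directly below is enough to decide whether the chain continues.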

3 The proper interval graph case 

Consider now the case that G is a proper interval graph. Since G is also an interval graph, it
can be represented by the NIR form, which however has an additional property, as described in
Definition 3.

Definition 3. An NIR form of n intervals is called a Stair Normal Interval Representation
(SNIR) form, iff it has the following additional property:

If for the intervals $[a, b)$ and $[c, d)$, $a < c$ holds, then $b \le d$ also holds.

Lemma 3. An arbitrary proper interval graph G can be converted to the SNIR form.

Proof. Suppose we are given an arbitrary realization of G, in which no interval strictly contains
another. Consider the case that in this realization the left endpoint of the interval $v_1 = [a, b]$ is
strictly less than the left endpoint of the interval $v_2 = [c, d]$, i.e., $a < c$. Then the same holds
for their right endpoints, i.e., $b < d$, since otherwise $v_1$ would strictly include $v_2$,
which is a contradiction. Since G is also an interval graph, it can be converted to the NIR
form, as described above. Suppose that $v_1$ and $v_2$ are converted to the intervals $v_1' = [a', b')$ and
$v_2' = [c', d')$ in the resulting NIR form respectively. Then, $a' < c'$ holds, since the relative order
of the left endpoints a and c is not reversed during the conversion of G to the NIR
form; also $b' \le d'$ holds, since the right endpoints b and d may be "aligned" to the same left
endpoint of another interval of the graph. Thus, the obtained NIR form satisfies the condition of Definition 3, i.e.,
it is an SNIR form. Note that in the special case of two initially identical intervals, i.e., $a = c$
and $b = d$, we obtain the same right endpoints $b' = d'$ for them in the resulting NIR form, while
their left endpoints are ordered increasingly, i.e., in this case the obtained NIR form is
also an SNIR form. □

Definition 4. The NIR matrix $H_G$ that corresponds to the SNIR form of a proper interval
graph G is called the Stair Normal Interval Representation (SNIR) matrix of G.

Definition 5. Consider the SNIR matrix $H_G$ of the proper interval graph G. The matrix
element $H_G(i,j)$ is called a pick of $H_G$, iff:

1. $i \ge j$,

2. if $i > j$ then $H_G(i,j) = 1$,

3. $H_G(i,k) = 0$, for every $k \in \{1, 2, \ldots, j-1\}$ and

4. $H_G(\ell,j) = 0$, for every $\ell \in \{i+1, i+2, \ldots, n\}$.

Given the pick $H_G(i,j)$ of $H_G$, the set

$$S := \{H_G(k,\ell) : i \ge k \ge \ell \ge j\}$$

of matrix entries is called the stair of $H_G$, which corresponds to this particular pick.

Recall that the left and the right endpoints of the $i$-th interval in the SNIR form of G correspond
to the $i$-th and the $(x_i + i)$-th elements of the $i$-th column of $H_G$ respectively. Therefore, due
to Definition 3, it holds that $x_i + i \ge x_j + j$ for $i > j$. Consequently, any stair of $H_G$ consists
of unit matrix elements, except for the diagonal elements of $H_G$, while the corresponding pick is
the lowermost left matrix entry of this stair. As seen in Figure 1(b), the SNIR matrix $H_G$
has a stair shape and equals the union of all its stairs. A stair of $H_G$ can also be recognized in
this figure, where the corresponding pick is marked with a circle.

Figure 1: (a) The NIR matrix $H_G$ of an interval graph G, (b) The SNIR matrix $H_{G'}$ of a proper
interval graph G'.
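
The pick conditions of Definition 5 can be checked entry by entry. The Python sketch below does so for a small SNIR matrix, reading condition 1 as $i \ge j$ (which condition 2 presupposes) and assuming $H(0) = 1$; the function names are hypothetical. Each pick $H_G(i,j)$ then yields a stair on rows and columns $j, \ldots, i$.

```python
def nir_matrix(x):
    # H[i - 1][j - 1] = 1 iff i > j and x_j + j - i >= 0 (assumes H(0) = 1).
    n = len(x)
    return [[1 if i > j and x[j - 1] + j - i >= 0 else 0
             for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def picks(H):
    """Return all picks of H as 1-indexed (row, column) pairs, top to bottom."""
    n = len(H)
    found = []
    for i in range(n):
        for j in range(i + 1):                        # condition 1: i >= j
            if i > j and H[i][j] != 1:                # condition 2
                continue
            if any(H[i][k] for k in range(j)):        # condition 3: zeros to the left
                continue
            if any(H[l][j] for l in range(i + 1, n)):  # condition 4: zeros below
                continue
            found.append((i + 1, j + 1))
    return found


# SNIR form [0,2), [1,3), [2,3), [3,4): right endpoints are non-decreasing.
print(picks(nir_matrix([1, 1, 0, 0])))  # [(2, 1), (3, 2), (4, 4)]
# Complete graph on three nodes: a single stair covering the whole triangle.
print(picks(nir_matrix([2, 1, 0])))     # [(3, 1)]
```

In the first example the three picks correspond to the node sets {1, 2}, {2, 3} and {4}, each of size $i - j + 1$.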


Lemma 4. An arbitrary graph is a proper interval graph iff it can be represented by the SNIR
form.

Proof. Due to Lemma 3, any proper interval graph can be represented by the SNIR form.
Conversely, the SNIR form is clearly a set of intervals, none of which strictly includes
another one, i.e., it is a realization of a proper interval graph. □

Lemma 5. Any stair of the SNIR matrix $H_G$ corresponds bijectively to a maximal clique in G.

Proof. Due to Lemma 2, every maximal clique of G corresponds bijectively to a row of $H_G$,
in which at least one of its unit elements or its zero diagonal element does not have any chain
of 1's below it. However, since G is a proper interval graph and due to Definition 5, such a
row corresponds bijectively to a pick of $H_G$ and therefore to a stair of it, as shown in
Figure 1(b). □

4 The k-cluster problem on proper interval graphs 

Due to Lemma 4, a proper interval graph G is equivalent to an SNIR matrix $H_G$. Denote
by $S_1, S_2, \ldots, S_m$, $m \le n - 1$, the stairs of $H_G$, numbered from the top to the bottom. Due to
Lemma 5, these stairs correspond bijectively to the maximal cliques $Q_1, Q_2, \ldots, Q_m$ of G. Denote
for simplicity $S_0 := \emptyset$ and $Q_0 := \emptyset$. Every stair $S_i$ constitutes together with its previous stairs
$S_1, S_2, \ldots, S_{i-1}$ a submatrix $H_i := H_{G_i}$ of $H_G$ that is equivalent to the subgraph $G_i := \bigcup_{\ell=1}^{i} Q_\ell$
of G, which remains also a proper interval graph. In particular, $H_m = H_G$ is equivalent to
$G_m = G$. We develop further a dynamic programming algorithm for the j-cluster problem on
$G_i$, which makes use of the optimal solutions of the q-cluster problems on $G_{i-1}$, for $q = 1, 2, \ldots, j$.
The critical observation here is that an arbitrary stair $S_i$ of $H_G$ contains at least one row
that does not belong to the previous stair $S_{i-1}$, i.e. $S_i \setminus S_{i-1} \neq \emptyset$ and therefore $Q_i \setminus Q_{i-1} \neq \emptyset$.
Suppose that the pick of $S_i$ is the matrix element $H_G(a_i, b_i)$. Then, the maximal clique $Q_i$ has
$|Q_i| = a_i - b_i + 1$ nodes, namely the $b_i$-th, $(b_i + 1)$-th, $\ldots$, $a_i$-th ones.

Denote now by $f_i(j, x, x')$ the value of an optimal solution of the j-cluster problem on $G_i$,
including exactly $x$ nodes of $Q_i \setminus Q_{i-1}$ and exactly $x'$ nodes of $Q_i \cap Q_{i-1}$.
Clearly, $0 \le x \le |Q_i \setminus Q_{i-1}|$, $0 \le x' \le |Q_i \cap Q_{i-1}|$ and $x + x' \le j$. Then, the value of an optimal
solution of the j-cluster problem on $G_i$ is $f_i(j) = \max_{x, x'} \{f_i(j, x, x')\}$. Note that obviously for the
j-cluster problem on a single stair $H_1 = S_1$ we should require that $x' = 0$ and $x = j$, as also that
$Q_1$ has at least j nodes, since otherwise we should include also $j - x > 0$ nodes of $Q_0 = \emptyset$, which
is a contradiction. Therefore, the following initial conditions hold for $i = 1$ and $j = 1, 2, \ldots, k$:


$$f_1(j, x, 0) = \begin{cases} \binom{j}{2}, & \text{if } x = j \le |Q_1| \\ -\infty, & \text{otherwise} \end{cases} \tag{3}$$


If $j \le |Q_i|$, then any subclique of $Q_i$ on j nodes is clearly an optimal solution. Otherwise,
consider the case $j > |Q_i|$. The recursive computation of $f_i(j, x, x')$, which is presented below,
makes use of the values $f_{i-1}(q, r, r')$ for $q = 1, 2, \ldots, j$, where $x$, $x'$, $r$ and $r'$ count the chosen
nodes of $Q_i \setminus Q_{i-1}$, $Q_i \cap Q_{i-1}$, $Q_{i-1} \setminus Q_{i-2}$ and $Q_{i-1} \cap Q_{i-2}$ respectively.
We distinguish the cases $Q_i \cap Q_{i-2} \neq \emptyset$ and $Q_i \cap Q_{i-2} = \emptyset$,
or equivalently $S_i \cap S_{i-2} \neq \emptyset$ and $S_i \cap S_{i-2} = \emptyset$. In the case $Q_i \cap Q_{i-2} \neq \emptyset$ an optimal solution
may include y nodes of $Q_{i-1} \setminus Q_{i-2}$, z nodes of $Q_i \cap Q_{i-2}$, w nodes of $Q_{i-1} \setminus Q_i$ and u nodes of
the remaining part of $G_i$. In the opposite case $Q_i \cap Q_{i-2} = \emptyset$, an optimal solution may include
y nodes of $Q_i \cap Q_{i-1}$, z nodes of $Q_{i-1} \setminus (Q_i \cup Q_{i-2})$, w nodes of $Q_{i-1} \cap Q_{i-2}$ and u nodes of the
remaining part of $G_i$. Both situations are illustrated in Figure 2. As can be easily verified,
for all these sets the following hold:


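$$
\begin{array}{ll}
\text{Case } Q_i \cap Q_{i-2} \neq \emptyset: & \text{Case } Q_i \cap Q_{i-2} = \emptyset: \\[4pt]
0 \le x \le x_0 := |Q_i \setminus Q_{i-1}| = a_i - a_{i-1} & 0 \le x \le x_0 := |Q_i \setminus Q_{i-1}| = a_i - a_{i-1} \\
0 \le y \le y_1 := |Q_{i-1} \setminus Q_{i-2}| = a_{i-1} - a_{i-2} & 0 \le y \le y_2 := |Q_i \cap Q_{i-1}| = a_{i-1} - b_i + 1 \\
0 \le z \le z_1 := |Q_i \cap Q_{i-2}| = a_{i-2} - b_i + 1 & 0 \le z \le z_2 := |Q_{i-1} \setminus (Q_i \cup Q_{i-2})| = b_i - a_{i-2} - 1 \\
0 \le w \le w_1 := |Q_{i-1} \setminus Q_i| = b_i - b_{i-1} & 0 \le w \le w_2 := |Q_{i-1} \cap Q_{i-2}| = a_{i-2} - b_{i-1} + 1 \\
0 \le u \le u_1 := b_{i-1} - 1 & 0 \le u \le u_2 := b_{i-1} - 1
\end{array}
\tag{4}
$$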


The case $Q_i \cap Q_{i-2} \neq \emptyset$ occurs exactly when $b_i \le a_{i-2}$, i.e. $H(a_{i-2} - b_i) = 1$, while the opposite
case $Q_i \cap Q_{i-2} = \emptyset$ occurs exactly when $H(b_i - a_{i-2} - 1) = 1$. Thus, since x, y, z, w and u add
up to j, we can summarize the relations in (4) to the following, for the general case:

$$
\begin{array}{l}
0 \le x \le x_0 \\
0 \le y \le y_1 \cdot H(a_{i-2} - b_i) + y_2 \cdot H(b_i - a_{i-2} - 1) \\
0 \le z \le z_1 \cdot H(a_{i-2} - b_i) + z_2 \cdot H(b_i - a_{i-2} - 1) \\
0 \le w \le w_1 \cdot H(a_{i-2} - b_i) + w_2 \cdot H(b_i - a_{i-2} - 1) \\
0 \le u \le u_1 \cdot H(a_{i-2} - b_i) + u_2 \cdot H(b_i - a_{i-2} - 1) \\
x + y + z + w + u = j
\end{array}
\tag{5}
$$

Figure 2: The split of the SNIR matrix $H_G$ for the recursion of the k-cluster problem on a proper
interval graph G, in the cases (a) $S_i \cap S_{i-2} \neq \emptyset$ and (b) $S_i \cap S_{i-2} = \emptyset$.
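
The bounds of the two cases can be combined with the Heaviside switches exactly as in the summarized relations. The Python sketch below computes the five upper bounds from the pick coordinates $a_i, a_{i-1}, a_{i-2}, b_i, b_{i-1}$; since the five sets partition the nodes of $G_i$, the bounds should add up to $a_i$ in both cases. The function name `split_bounds` and its argument names are hypothetical, and $H(0) = 1$ is assumed.

```python
def heaviside(t):
    # Convention assumed here: H(t) = 1 for t >= 0, and 0 otherwise.
    return 1 if t >= 0 else 0


def split_bounds(a_i, a_prev, a_prev2, b_i, b_prev):
    """Upper bounds (x0, y, z, w, u) for the split of G_i in the proper case.

    a_i, b_i are the row and column of the pick of stair S_i; a_prev, b_prev
    belong to S_{i-1} and a_prev2 to S_{i-2}.
    """
    meet = heaviside(a_prev2 - b_i)          # 1 iff Q_i and Q_{i-2} intersect
    apart = heaviside(b_i - a_prev2 - 1)     # 1 iff Q_i and Q_{i-2} are disjoint
    x0 = a_i - a_prev
    y = (a_prev - a_prev2) * meet + (a_prev - b_i + 1) * apart
    z = (a_prev2 - b_i + 1) * meet + (b_i - a_prev2 - 1) * apart
    w = (b_i - b_prev) * meet + (a_prev2 - b_prev + 1) * apart
    u = (b_prev - 1) * meet + (b_prev - 1) * apart
    return x0, y, z, w, u


# Q_{i-2} = {1..4}, Q_{i-1} = {2..6}, Q_i = {4..7}: Q_i meets Q_{i-2} in node 4.
print(split_bounds(7, 6, 4, 4, 2))  # (1, 2, 1, 2, 1)
# Q_{i-2} = {1..3}, Q_{i-1} = {3..6}, Q_i = {6..8}: Q_i and Q_{i-2} are disjoint.
print(split_bounds(8, 6, 3, 6, 3))  # (2, 1, 2, 1, 2)
```

In both examples the bounds sum to $a_i$ (7 and 8 respectively), which is a quick consistency check on the two columns of the case distinction.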

For simplicity, let $C_1 = z \cdot H(a_{i-2} - b_i)$ and $C_2 = z \cdot H(b_i - a_{i-2} - 1)$. Now, the value $f_i(j, x, x')$
can be computed by using the top-down approach of the following equation, for both cases
$Q_i \cap Q_{i-2} \neq \emptyset$ and $Q_i \cap Q_{i-2} = \emptyset$:

$$
f_i(j, x, y + C_1) = \begin{cases}
\binom{j}{2}, & \text{if } x + y + C_1 = j \le |Q_i| \\[6pt]
\max\limits_{y, z, w, u \in \mathbb{N}_0} \left\{ f_{i-1}(j - x,\; y + C_2,\; C_1 + w) + \binom{x}{2} + x\,(y + C_1) \right\}, & \text{otherwise}
\end{cases}
\tag{6}
$$

Finally, the dynamic programming Algorithm 1 returns the value of an optimal solution of the
k-cluster problem on G. After applying some necessary modifications, it will return the optimal
solution, instead of its value.

Algorithm Proper-Interval-k-cluster problem(G):

Input: An arbitrary realization of a proper interval graph G
Output: The value of an optimal solution of the k-cluster problem on G

1. Construct the SNIR matrix $H_G$. Let $S_1, S_2, \ldots, S_m$ be the m stairs of $H_G$, which correspond
to the maximal cliques $Q_1, Q_2, \ldots, Q_m$ of G

2. If $m = 1$ Then Return $f_1(k) = f_1(k, k, 0)$, computed from (3);

Else Return $f_m(k) = \max\{f_m(k, x, x') : 0 \le x \le |Q_m \setminus Q_{m-1}|,\ 0 \le x' \le |Q_m \cap Q_{m-1}|,\ x + x' \le k\}$,
computed from (6)

Algorithm 1: The value of an optimal solution of the k-cluster problem on the proper interval
graph G.

Theorem 1. The k-cluster problem is solvable in $O(nk^5)$ time on proper interval graphs.

Proof. The computation of a single $f_i(j)$ in Algorithm 1 takes at most $O(j^4) = O(k^4)$ time
due to the combinations of x, y, z, w, u such that they sum up to j, since x, y, z and w may
vary and $u = j - x - y - z - w$ is then uniquely determined by them. Every $f_i(j)$ is computed
for all $i \in \{1, 2, \ldots, m\}$ and $j \in \{1, 2, \ldots, k\}$, i.e., altogether at most $m \cdot k = O(nk)$ quantities are
computed. Thus, since any proper interval graph can be recognized and converted to the SNIR
form in linear time, the k-cluster problem can be solved in $O(nk^5)$ time on any proper interval
graph. □
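
The counting step in the proof can be checked numerically. The number of ways to write $j$ as an ordered sum of five non-negative integers is $\binom{j+4}{4} = O(j^4)$, because four of the variables vary freely and the fifth is determined. The Python sketch below confirms this by enumeration; it ignores the individual upper bounds on the variables, so it gives an upper bound on the number of combinations the recursion has to inspect. The function name `count_splits` is hypothetical.

```python
from itertools import product
from math import comb


def count_splits(j, parts=5):
    """Count tuples of `parts` non-negative integers that sum to j.

    Only parts - 1 entries are enumerated; the last one is forced to be
    j minus their sum, which must be non-negative.
    """
    free = parts - 1
    return sum(1 for t in product(range(j + 1), repeat=free) if sum(t) <= j)


print(count_splits(6, parts=5), comb(6 + 4, 4))  # 210 210
print(count_splits(6, parts=6), comb(6 + 5, 5))  # 462 462
```

The second line uses six parts, which is the analogous count when one more variable is added to the split.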

Note that in the presented analysis the subgraph that corresponds to the obtained optimal
solution is not necessarily connected. Lemma 6 proposes a modification to Algorithm 1, in
order to find an optimal solution under the additional constraint of connectivity.

Lemma 6. Algorithm 1 returns the value of an optimal solution of the k-cluster problem on
proper interval graphs, under the additional constraint of connectivity, if the following additional
condition to (5) is required:

$$y + C_1 \ge 1, \quad \text{if } x > 0. \tag{7}$$

After this modification, the runtime of the proposed algorithm remains $O(nk^5)$.

Proof. The proof is done by induction. If $i = 1$, then the obtained solution is always connected,
as an induced subgraph of a clique. Suppose now that $i > 1$ and $x > 0$. It follows that we use
$x \ge 1$ nodes of $Q_i$, which are not included in $Q_j$, for any $j < i$. Therefore, in order to construct
a connected subgraph, it is equivalent to require that at least one node of $Q_i \cap G_{i-1} = Q_i \cap Q_{i-1}$
is included, i.e., a node which is simultaneously connected to the x nodes of $Q_i \setminus Q_{i-1}$ and to
at least one node of the remaining graph $G_{i-1}$. However, as described above, we include in the
constructed subgraph exactly $y + z$ nodes of $Q_i \cap Q_{i-1}$ if $Q_i \cap Q_{i-2} \neq \emptyset$ and exactly y nodes
of $Q_i \cap Q_{i-1}$ if $Q_i \cap Q_{i-2} = \emptyset$. Namely, we include exactly $y + C_1$ nodes of $Q_i \cap Q_{i-1}$ in the
general case. Therefore, in order to construct a connected subgraph, it is equivalent to require
that $y + C_1 \ge 1$. Finally, the asymptotic complexity of the proposed algorithm obviously remains
unchanged, when requiring the additional condition (7) to the conditions (5). □

5 The k-cluster problem on interval graphs 

In this section we propose a polynomial dynamic programming algorithm for the k-cluster problem
on interval graphs, whose complexity status was an open question [9]. The proposed algorithm
constitutes a generalization of Algorithm 1 for proper interval graphs. Due to Lemma 1,
an interval graph G is equivalent to an NIR matrix $H_G$. In the following consider an interval
graph G on n nodes, as well as its NIR matrix $H_G$.

Due to Lemma 2, any maximal clique of G corresponds bijectively to a row of the NIR matrix
$H_G$, in which at least one of its unit elements or its zero diagonal element does not have any
chain of 1's below it. The maximal clique, which refers to such a row, contains all intervals,
i.e. nodes, which correspond to the unit elements and the zero diagonal element of this row.
Denote these maximal cliques of G by $Q_1, Q_2, \ldots, Q_m$, $m \le n - 1$, numbered from the top to the
bottom, as well as $Q_0 := \emptyset$. Suppose also that the maximal clique $Q_i$ occurs at the $a_i$-th row of
$H_G$ and denote by $|Q_i|$ the number of nodes of $Q_i$. It holds clearly that $Q_i \setminus Q_{i-1} \neq \emptyset$ for all
$i = 1, 2, \ldots, m$. Every maximal clique $Q_i$ constitutes together with its previous maximal cliques
$Q_1, Q_2, \ldots, Q_{i-1}$ a subgraph $G_i$ of G, which remains also an interval graph. Similarly to Section
4 for the proper interval graphs, we develop further a dynamic programming algorithm for the
j-cluster problem on $G_i$, which makes use of the optimal solutions of the q-cluster problems on
$G_{i-1}$, for $q = 1, 2, \ldots, j$.


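An optimal solution may include x nodes of $Q_i \setminus Q_{i-1}$, y nodes of $(Q_i \cap Q_{i-1}) \setminus Q_{i-2}$, z nodes of $Q_{i-1} \setminus (Q_i \cup Q_{i-2})$,
w nodes of $Q_i \cap Q_{i-2}$, u nodes of $(Q_{i-1} \cap Q_{i-2}) \setminus Q_i$ and v nodes of the remaining part of $G_i$,
as illustrated in Figure 3. We compute in Appendix A the split of the NIR matrix $H_G$ and
we obtain the following relations for the variables x, y, z, w, u and v:

$$
\begin{array}{l}
0 \le x \le |Q_i \setminus Q_{i-1}| = a_i - a_{i-1} \\[4pt]
0 \le y \le |(Q_i \cap Q_{i-1}) \setminus Q_{i-2}| = \sum_{\ell = a_{i-2}+1}^{a_{i-1}} H(\ell + x_\ell - a_i) \\[4pt]
0 \le z \le |Q_{i-1} \setminus (Q_i \cup Q_{i-2})| = a_{i-1} - a_{i-2} - \sum_{\ell = a_{i-2}+1}^{a_{i-1}} H(\ell + x_\ell - a_i) \\[4pt]
0 \le w \le |Q_i \cap Q_{i-2}| = \sum_{\ell = 1}^{a_{i-2}} H(\ell + x_\ell - a_i) \\[4pt]
0 \le u \le |(Q_{i-1} \cap Q_{i-2}) \setminus Q_i| = \sum_{\ell = 1}^{a_{i-2}} H(\ell + x_\ell - a_{i-1}) \cdot H(a_i - \ell - x_\ell - 1) \\[4pt]
0 \le v \le a_{i-2} - \sum_{\ell = 1}^{a_{i-2}} H(\ell + x_\ell - a_i) - \sum_{\ell = 1}^{a_{i-2}} H(\ell + x_\ell - a_{i-1}) \cdot H(a_i - \ell - x_\ell - 1) \\[4pt]
x + y + z + w + u + v = j
\end{array}
\tag{8}
$$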

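Now, the value $f_i(j, x, x')$ can be computed by using the top-down approach of the following
equation:

$$
f_i(j, x, y + w) = \begin{cases}
\binom{j}{2}, & \text{if } x + y + w = j \le |Q_i| \\[6pt]
\max\limits_{y, z, w, u, v \in \mathbb{N}_0} \left\{ f_{i-1}(j - x,\; y + z,\; w + u) + \binom{x}{2} + x\,(y + w) \right\}, & \text{otherwise}
\end{cases}
\tag{9}
$$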

Finally, the dynamic programming Algorithm 2, similarly to Algorithm 1, returns the value of
an optimal solution of the k-cluster problem on G. After applying some necessary modifications,
it will return the optimal solution, instead of its value.

Algorithm Interval-k-cluster problem(G):

Input: An arbitrary realization of an interval graph G

Output: The value of an optimal solution of the k-cluster problem on G

1. Construct the NIR matrix $H_G$. Let $Q_1, Q_2, \ldots, Q_m$ be the m maximal cliques of G

2. If $m = 1$ Then Return $f_1(k) = f_1(k, k, 0)$, computed from (3);

Else Return $f_m(k) = \max\{f_m(k, x, x') : 0 \le x \le |Q_m \setminus Q_{m-1}|,\ 0 \le x' \le |Q_m \cap Q_{m-1}|,\ x + x' \le k\}$,
computed from (9)

Algorithm 2: The value of an optimal solution of the k-cluster problem on the interval graph G.

Theorem 2. The k-cluster problem is solvable in $O(nk^6)$ time on interval graphs.

Proof. The computation of a single $f_i(j)$ in Algorithm 2 takes at most $O(j^5) = O(k^5)$ time
due to the combinations of x, y, z, w, u, v such that they sum up to j, since x, y, z, w and
u may vary and $v = j - x - y - z - w - u$ is then uniquely determined by them. Every $f_i(j)$
is computed for all $i \in \{1, 2, \ldots, m\}$ and $j \in \{1, 2, \ldots, k\}$, i.e., altogether at most $m \cdot k = O(nk)$
quantities are computed. Thus, since any interval graph can be recognized and converted to the
NIR form in linear time, the k-cluster problem can be solved in $O(nk^6)$ time on any interval
graph. □

Figure 3: The split of the NIR matrix $H_G$ for the recursion of the k-cluster problem on an
interval graph G.

Lemma 7. The proposed algorithm returns the value of an optimal solution of the k-cluster problem
on interval graphs, under the additional constraint of connectivity, if the following additional
condition is required to the conditions (8):

$$y + w \ge 1, \quad \text{if } x > 0. \tag{10}$$

After this modification, the runtime of the proposed algorithm remains $O(nk^6)$.

Proof. The proof is done by induction. If $i = 1$, then the obtained solution is always connected,
as an induced subgraph of a clique. Suppose now that $i > 1$ and $x > 0$. It follows that we use
$x \ge 1$ nodes of $Q_i$, which are not included in $Q_j$, for any $j < i$. Therefore, in order to construct
a connected subgraph, it is equivalent to require that at least one node of $Q_i \cap G_{i-1} = Q_i \cap Q_{i-1}$
is included, i.e., a node which is simultaneously connected to the x nodes of $Q_i \setminus Q_{i-1}$ and to
at least one node of the remaining graph $G_{i-1}$. However, as described above, we include in
the constructed subgraph exactly $y + w$ nodes of $Q_i \cap Q_{i-1}$. Therefore, in order to construct a
connected subgraph, it is equivalent to require that $y + w \ge 1$. Finally, the asymptotic complexity
of the proposed algorithm obviously remains unchanged, when requiring the additional condition
(10) to the conditions (8). □

6 Conclusions 

In this paper an efficient matrix representation that characterizes the interval graphs, as well
as its restriction to the proper interval graphs, is used, which leads to simple polynomial time
algorithms for the k-cluster problem on these classes of graphs. This problem is known to be
NP-hard on an arbitrary graph, as a generalization of the maximum clique problem, as well as
on chordal graphs. In contrast, its complexity on interval and proper interval graphs was
an open question.

A The split of the NIR matrix $H_G$

We recall first that the maximal clique $Q_i$ is assumed to occur at the $a_i$-th row of $H_G$,
for $i = 1, 2, \ldots, m$. Suppose that $a_{i-1} < \ell \le a_i$. If the chain of 1's under the $\ell$-th diagonal element
of $H_G$ stops at a row, which is higher than the $a_i$-th one, then a maximal clique would occur
between $Q_{i-1}$ and $Q_i$, which is a contradiction. Thus, the chain under the $\ell$-th diagonal element
stops either at the $a_i$-th row, or even lower. Suppose now that $\ell \le a_{i-1}$. If $\ell \in Q_i$, then also
$\ell \in Q_{i-1}$, since the chain under the $\ell$-th diagonal element stops either at the $a_i$-th row, or even
lower, i.e. strictly lower than the $a_{i-1}$-th row. Therefore, the elements of $Q_i \setminus Q_{i-1}$ are exactly the
$(a_{i-1} + 1)$-th, $\ldots$, $a_i$-th diagonal elements. Thus,

$$|Q_i \setminus Q_{i-1}| = a_i - a_{i-1} \tag{11}$$

In order to compute the value $|(Q_i \cap Q_{i-1}) \setminus Q_{i-2}|$, we have to compute how many of the
1st, 2nd, $\ldots$, $a_{i-1}$-th diagonal elements belong to $Q_i$ and to $Q_{i-1}$, but not to $Q_{i-2}$. For $1 \le \ell \le a_{i-1}$,
the arbitrary $\ell$-th diagonal element belongs to $Q_i$ exactly when its chain of 1's reaches the $a_i$-th
row, i.e. exactly when $\ell + x_\ell \ge a_i$, or equivalently $H(\ell + x_\ell - a_i) = 1$. In this case, it belongs
also to $Q_{i-1}$, since $a_{i-1} < a_i$. Further, for $1 \le \ell \le a_{i-2}$, if $H(\ell + x_\ell - a_i) = 1$ then the $\ell$-th
diagonal element belongs also to $Q_{i-2}$ and therefore not to $(Q_i \cap Q_{i-1}) \setminus Q_{i-2}$. It follows that

$$|(Q_i \cap Q_{i-1}) \setminus Q_{i-2}| = \sum_{\ell = a_{i-2}+1}^{a_{i-1}} H(\ell + x_\ell - a_i) \tag{12}$$

Now, the sets $(Q_i \cap Q_{i-1}) \setminus Q_{i-2}$ and $Q_{i-1} \setminus (Q_i \cup Q_{i-2})$ partition the set $Q_{i-1} \setminus Q_{i-2}$, which
has $a_{i-1} - a_{i-2}$ nodes, due to (11). Thus, it follows from (12) that

$$|Q_{i-1} \setminus (Q_i \cup Q_{i-2})| = a_{i-1} - a_{i-2} - \sum_{\ell = a_{i-2}+1}^{a_{i-1}} H(\ell + x_\ell - a_i) \tag{13}$$

In order to compute the value $|Q_i \cap Q_{i-2}|$, we have to compute how many of the 1st, 2nd, $\ldots$, $a_{i-2}$-th
diagonal elements belong simultaneously to $Q_{i-2}$ and to $Q_i$. For $1 \le \ell \le a_{i-2}$, the $\ell$-th one
belongs to $Q_i$ exactly when its chain of 1's reaches the $a_i$-th row, i.e. exactly when $\ell + x_\ell \ge a_i$, or
equivalently $H(\ell + x_\ell - a_i) = 1$. In this case, if $\ell \neq a_{i-2}$, then its chain reaches also the $a_{i-2}$-th
row, which means that it belongs also to $Q_{i-2}$, while the $a_{i-2}$-th one belongs always to $Q_{i-2}$. It
follows that

$$|Q_i \cap Q_{i-2}| = \sum_{\ell = 1}^{a_{i-2}} H(\ell + x_\ell - a_i) \tag{14}$$

Similarly, in order to compute the value $|(Q_{i-1} \cap Q_{i-2}) \setminus Q_i|$, we have to compute how many
of the 1st, 2nd, $\ldots$, $a_{i-2}$-th diagonal elements belong simultaneously to $Q_{i-1}$ and to $Q_{i-2}$ but not to
$Q_i$. For $1 \le \ell \le a_{i-2}$, the $\ell$-th one belongs to $Q_{i-1}$ exactly when $H(\ell + x_\ell - a_{i-1}) = 1$. In this
case it belongs also to $Q_{i-2}$, since $a_{i-2} < a_{i-1}$. Further, it does not belong to $Q_i$ exactly when
$\ell + x_\ell < a_i$, or equivalently $H(a_i - \ell - x_\ell - 1) = 1$. It follows that

$$|(Q_{i-1} \cap Q_{i-2}) \setminus Q_i| = \sum_{\ell = 1}^{a_{i-2}} H(\ell + x_\ell - a_{i-1}) \cdot H(a_i - \ell - x_\ell - 1) \tag{15}$$

Finally, the complementary part in $G_i$ of the sets in (11)-(15) has

$$a_{i-2} - \sum_{\ell = 1}^{a_{i-2}} H(\ell + x_\ell - a_i) - \sum_{\ell = 1}^{a_{i-2}} H(\ell + x_\ell - a_{i-1}) \cdot H(a_i - \ell - x_\ell - 1) \tag{16}$$

nodes, since $G_i$ has overall $a_i$ nodes.
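
The six cardinalities derived in this appendix can be cross-checked on small instances by computing the sets directly. The Python sketch below evaluates the closed-form sums and compares them with explicit set operations, where a node $\ell \le a$ is taken to belong to the clique at row $a$ exactly when $\ell + x_\ell \ge a$. It assumes $H(0) = 1$ and $a_0 = 0$ for the empty clique $Q_0$; all function names are hypothetical.

```python
def heaviside(t):
    # Convention assumed here: H(t) = 1 for t >= 0, and 0 otherwise.
    return 1 if t >= 0 else 0


def clique_at_row(x, a):
    """Nodes l <= a (1-indexed) whose chain of 1's reaches row a."""
    return {l for l in range(1, a + 1) if l + x[l - 1] >= a}


def split_sizes(x, a_i, a_prev, a_prev2):
    """Closed-form sizes of the six parts of G_i, in the order x, y, z, w, u, v."""
    def reach(l, row):
        return heaviside(l + x[l - 1] - row)
    s_y = sum(reach(l, a_i) for l in range(a_prev2 + 1, a_prev + 1))
    s_w = sum(reach(l, a_i) for l in range(1, a_prev2 + 1))
    s_u = sum(reach(l, a_prev) * heaviside(a_i - l - x[l - 1] - 1)
              for l in range(1, a_prev2 + 1))
    return (a_i - a_prev, s_y, a_prev - a_prev2 - s_y, s_w, s_u,
            a_prev2 - s_w - s_u)


def split_sizes_by_sets(x, a_i, a_prev, a_prev2):
    """The same six sizes, obtained from explicit set operations."""
    q_i, q_p, q_pp = (clique_at_row(x, a) for a in (a_i, a_prev, a_prev2))
    parts = [q_i - q_p, (q_i & q_p) - q_pp, q_p - (q_i | q_pp),
             q_i & q_pp, (q_p & q_pp) - q_i]
    sizes = [len(p) for p in parts]
    return tuple(sizes + [a_i - sum(sizes)])


# NIR form [0,2), [1,3), [2,3), [3,4) with maximal cliques at rows 2, 3, 4.
print(split_sizes([1, 1, 0, 0], 4, 3, 2))          # (1, 0, 1, 0, 1, 1)
print(split_sizes_by_sets([1, 1, 0, 0], 4, 3, 2))  # (1, 0, 1, 0, 1, 1)
```

Agreement on a few hand-checked instances is of course no proof, but it guards against sign or index slips when the sums are transcribed.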